perf(gate): run the test suite once per gate — share coverage run with unit stage (closes #215) #216

Open
yuyu04 wants to merge 2 commits into qwerfunch:develop from yuyu04:feature/test-run-dedup

yuyu04 commented Jun 29, 2026

Contributor

Summary

pre-push ran the full test suite twice: stage_2.1 (vitest run) and stage_2.2 (vitest run --coverage). The coverage run already runs every test, so a gate-scoped memo now shares ONE run across both stages. Closes #215.

⚠️ Stacked on #214. This branch's base commit 799126f (the incremental TS gate) isn't on develop yet — until #214 merges, this PR's diff includes it; review only the 8c3ac81 commit. Merge #214 first, then this.

A/B (cladding's own repo, worktrees)

| clad check --tier=pre-push | time |
| --- | --- |
| develop (suite ×2) | ~40.4s |
| this PR (suite ×1) | ~30.1s |
| delta | −~10s |

Soundness (verified — a gate must never pass a broken tree)

unitActionFromCoverage returns reuse-pass only for a green coverage run (exitCode === 0); every non-green case → fallback (tests-only re-run). Direct integration check on a temp fixture:

| suite | unit | coverage |
| --- | --- | --- |
| passing | ✅ pass (reused, one run) | ✅ pass |
| failing test | fail | fail |

On a coverage-threshold miss with passing tests, the coverage stage fails while unit's tests-only fallback passes, so the failure is attributed to coverage and not to the tests. A failing test can never be reported as a unit pass via reuse.
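
The decision rule described above can be sketched as a small pure function. This is only an illustration: the function name and the reuse-pass / fallback outcomes come from this PR's description, while the `CoverageRun` and `UnitAction` types are assumptions and may not match the real signatures in src/stages/test-run-cache.ts.

```typescript
// Hypothetical shapes: the PR names unitActionFromCoverage and its two
// outcomes, but not these types.
type CoverageRun = { exitCode: number | null };
type UnitAction = "reuse-pass" | "fallback";

// Reuse the coverage run as the unit result only when it was green.
// Anything else (failing test, threshold miss, crash, missing exit code)
// falls back to a tests-only re-run, so the unit stage never passes on
// the strength of a non-green run.
function unitActionFromCoverage(run: CoverageRun): UnitAction {
  return run.exitCode === 0 ? "reuse-pass" : "fallback";
}
```

Keeping the decision pure (no process spawning, no cache access) is what makes the soundness claim easy to test in isolation.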

What's in the box

  • src/stages/test-run-cache.ts (new): primeTestRunCache(on), memoizeTestRun(cwd, run) (generic, keyed by resolved cwd), and the pure unitActionFromCoverage decision.
  • unit.ts: in a primed gate, trigger + share the coverage run; reuse on green, tests-only fallback on non-green. Unprimed (standalone / MCP) → unchanged.
  • cov.ts: read the shared (memoized) coverage run.
  • clad.ts: prime around the stage loop, clear in finally.
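
A minimal sketch of how a gate-scoped memo like the one listed above could work, assuming a module-level map that exists only while the gate is primed. The names primeTestRunCache and memoizeTestRun are from this PR; the `TestRun` type, the `runGate` helper, and clearing via `primeTestRunCache(false)` are assumptions made for illustration, not the actual implementation.

```typescript
import { resolve } from "node:path";

// Hypothetical result shape; the real one is not shown in the PR.
type TestRun = { exitCode: number | null; output: string };

// null means "unprimed": standalone / MCP callers get no sharing.
let cache: Map<string, Promise<TestRun>> | null = null;

function primeTestRunCache(on: boolean): void {
  cache = on ? new Map() : null;
}

function memoizeTestRun(cwd: string, run: () => Promise<TestRun>): Promise<TestRun> {
  if (cache === null) return run(); // unprimed: behave exactly as before
  const key = resolve(cwd); // "." and the absolute path share one entry
  let pending = cache.get(key);
  if (pending === undefined) {
    pending = run(); // first caller starts the run
    cache.set(key, pending);
  }
  return pending; // later callers share the same promise
}

// Hypothetical gate loop: prime before the stages, always clear afterwards.
async function runGate(stages: Array<() => Promise<void>>): Promise<void> {
  primeTestRunCache(true);
  try {
    for (const stage of stages) await stage();
  } finally {
    primeTestRunCache(false); // assumption: clearing is "prime off"
  }
}
```

Storing the promise rather than the resolved result means whichever stage asks first starts the run and the other stage simply awaits it, regardless of stage order.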

Scope

Test selection (changed-files) is deliberately NOT done — a gate certifies the whole tree. This removes only the duplicate full run, not coverage of any test.

Feature cycle

spec/features/test-run-dedup-97abf5db.yaml (F-97abf5db, 4 ACs) → implement → blind tests (tests/stages/test-run-dedup.test.ts, 13) → clad done GREEN (cladding's own gate ran with the dedup and passed).

🤖 Generated with Claude Code

yuyu04 and others added 2 commits July 2, 2026 10:28
…(F-bfe14aac)

The TypeScript type and lint gates re-ran from scratch every gate. They now
reuse a build cache: `tsc --noEmit --incremental` (build-info file) and
`eslint --cache`, both under `.cladding/cache/` (already gitignored, so a
managed project's tree stays clean).
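
As a rough illustration of the two invocations described above, the argument lists might be assembled as follows. The `--incremental`, `--tsBuildInfoFile`, `--cache`, and `--cache-location` flags are real tsc / eslint options; the cache file names and helper function names are hypothetical and not taken from this commit.

```typescript
import { join } from "node:path";

// Cache directory named in the commit message; file names below are assumed.
const CACHE_DIR = join(".cladding", "cache");

// Arguments for an incremental, no-emit type check.
function typeGateArgs(cacheDir: string = CACHE_DIR): string[] {
  return [
    "--noEmit",
    "--incremental",
    "--tsBuildInfoFile", join(cacheDir, "tsc.tsbuildinfo"), // hypothetical file name
  ];
}

// Arguments for a cached lint run over the project root.
function lintGateArgs(cacheDir: string = CACHE_DIR): string[] {
  return [
    ".",
    "--cache",
    "--cache-location", join(cacheDir, "eslintcache"), // hypothetical file name
  ];
}
```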

Measured on cladding's own repo (unchanged re-run — the local pre-commit/
pre-push loop): tsc 2.7s → 1.1s, eslint 2.5s → 0.6s (~3.4s saved).

SOUND, not a shortcut: with a stale build-info present, a newly-introduced
type error in an included file is STILL caught (verified — tsc rebuilds the
affected program slice; eslint --cache keys on file+config hash). Cold runs
(fresh CI checkout) just rebuild the cache — no regression.

Test execution is deliberately NOT scoped: a gate must certify the whole tree,
so changed-files / test-selection (unsound for a gate) was avoided. The
dominant test cost (~20s) and the unit+coverage double-run (~9.5s) are noted
as separate follow-ups.

Existing toolchain arg-pins updated; blind-authored test (incremental-gate.test.ts, 5).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… with the unit stage (F-97abf5db)

pre-push ran BOTH stage_2.1 (vitest run) and stage_2.2 (vitest run --coverage),
executing the full suite TWICE (~9.5s + ~10.7s). The coverage run already runs
every test, so the unit run was redundant.

- src/stages/test-run-cache.ts (new) — gate-scoped memo (mirrors spec cache
  F-cd0415): primeTestRunCache(on) + memoizeTestRun(cwd, run) + the pure
  unitActionFromCoverage decision.
- unit.ts — in a primed gate, trigger + share the coverage run; on GREEN reuse
  it (no second suite run); on non-green fall back to a tests-only run.
- cov.ts — read the shared (memoized) coverage run.
- clad.ts — prime around the stage loop, clear in finally.

SOUND attribution: reuse-pass is returned ONLY for a green coverage run, so a
failing test can never surface as a unit pass. Verified: a failing test reds
BOTH stages; a passing suite greens both with one run. Test SELECTION
(changed-files) intentionally avoided — a gate must run the whole suite.

Measured (cladding's own repo): clad check --tier=pre-push ~40.4s → ~30.1s (-~10s).
Blind-authored tests (tests/stages/test-run-dedup.test.ts, 13).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
yuyu04 force-pushed the feature/test-run-dedup branch from 8c3ac81 to 6e8a5cf on July 2, 2026 01:29